@martinkilbinger martinkilbinger commented Feb 2, 2026

Summary

This PR updates job handling on CANFAR to use the new canfar Python library.

This comes with job submission and monitoring classes and scripts.

New tiles (P9) were added to the CFIS tile list.

Closes some old issues (#669).

Reviewer Checklist

Reviewers should tick the following boxes before approving and merging the PR.

  • The PR targets the develop branch
  • The PR is assigned to the developer
  • The PR has appropriate labels
  • The PR is included in appropriate projects and/or milestones
  • The PR includes a clear description of the proposed changes
  • If the PR addresses an open issue the description includes "closes #"
  • The code and documentation style match the current standards
  • Documentation has been added/updated consistently with the code
  • All CI tests are passing
  • API docs have been built and checked at least once (if relevant)
  • All changed files have been checked and comments provided to the developer
  • All of the reviewer's comments have been satisfactorily addressed by the developer

@martinkilbinger martinkilbinger changed the title P9 Canfar Python Libraries Feb 2, 2026
@martinkilbinger martinkilbinger linked an issue Feb 2, 2026 that may be closed by this pull request
This was referenced Feb 4, 2026

@cailmdaley cailmdaley left a comment

Nice work modernising the CANFAR submission path and replacing CDSClient with astroquery. The documentation rewrite is a real improvement. A few things to fix before merge — mostly runtime bugs that will crash in production, plus some smaller items.

Bugs that will crash

  • canfar_monitor.py:130 — RunTimeError → RuntimeError
  • canfar_monitor.py:188 — self._get_kind() → self.get_kind() (no underscore)
  • canfar_monitor.py:121 — self._params["verbose"] referenced but never defined in params_default
  • canfar_monitor.py:206-208 — else after try/except runs on success, not failure. The "Failed to destroy" message fires when destruction succeeds.
  • canfar_monitor.py:222,235 — "failed{estr}" missing f prefix
  • summary.py:218,235 — mgs instead of msg → NameError at runtime
  • clear_ngmix_prev.py:117 — globs ngmix_out_dir when it should glob prev_out_dir
  • distribute_tiles.py:124 — compares string args with int literals (0, 1) — will never match, should be ('0', '1')
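
To illustrate the try/except/else and f prefix points, here is a minimal sketch. The function destroy_sessions and the stand-in fake_destroy are hypothetical and are not the code in canfar_monitor.py; they only show which branch runs when, and that the f prefix is needed for {estr} to be interpolated.

```python
def destroy_sessions(session_ids, destroy):
    """Try to destroy each session and report what happened.

    `destroy` is a hypothetical callable standing in for the real
    CANFAR client call; it raises an exception on failure.
    """
    messages = []
    for sid in session_ids:
        try:
            destroy(sid)
        except Exception as err:
            # Failure path: the f prefix is required for interpolation.
            estr = f": {type(err).__name__}: {err}"
            messages.append(f"Failed to destroy {sid}{estr}")
        else:
            # The else branch runs only when the try body raised nothing.
            messages.append(f"Destroyed {sid}")
    return messages


def fake_destroy(sid):
    # Stand-in that fails for one particular ID.
    if sid == "bad":
        raise RuntimeError("no such session")


print(destroy_sessions(["a1", "bad"], fake_destroy))
```

The success message belongs in the else branch and the failure message in the except branch; swapping them reproduces the behaviour described for lines 206-208.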

Intentional?

  • make_cat.py:348 — threshold changed from 0.9 to 0.0. Since a ratio can't be negative, this disables the check entirely. If intentional, the dead code block could be removed or at least commented with why.

Style / design

  • distribute_tiles.py:1 — hardcoded shebang #!/arc/home/kilbinger/...
  • uncompress_fits.py:62 — bare except: — worth catching something specific
  • uncompress_fits.py:75-76 — the TFORM fix runs outside the if name: guard, so on empty keys value/comment are stale from the previous iteration
  • canfar_submit.py:32-33 — the != 512 guard can never trigger (set to 512 on the line above)
  • canfar_submit.py:260 — AsyncSession() opened twice (once in run_async, again in _submit_single_batch)
  • pyproject.toml — requires-python downgraded from >=3.11 to >=3.10 without mention. Also canfar, sf_tools, h5py, pandas added as core deps — could these live under an optional [canfar] extra?
  • Star imports in summary_run.py and summary_params_pre_v2.py
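
On the bare except point, a small sketch of why catching something specific is safer. The helper names below are hypothetical, and the exception that uncompress_fits.py should actually catch is not known from this thread; ValueError is used purely for illustration.

```python
def read_int(text):
    """Parse an integer, returning None only for malformed input."""
    try:
        return int(text)
    except ValueError:
        # Only the expected failure is handled; anything else propagates.
        return None


def swallow_everything(func):
    """Anti-pattern: a bare except also traps SystemExit and KeyboardInterrupt."""
    try:
        func()
    except:  # noqa: E722 (shown deliberately)
        return "swallowed"
    return "ok"


def ask_to_exit():
    raise SystemExit(1)


print(read_int("42"), read_int("4x2"))
print(swallow_everything(ask_to_exit))
```

With the bare except, even a request to exit the interpreter is silently absorbed, which is rarely what a script wants.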

Typos

  • canfar_monitor.py:3 — "Montiot" → "Monitor"
  • canfar_monitor.py:247 — "Retreiving" → "Retrieving"
  • mask.py:498 — "astroquer" → "astroquery"
  • canfar_submit.py:116 — "SESSSION" (triple S)
  • mask.py:496-501 — IndexError raised with positional args instead of a formatted string
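
For the IndexError item, a short sketch of the difference between positional arguments and a formatted string. The function lookup and its message text are hypothetical, not taken from mask.py.

```python
def lookup(catalogue, index):
    """Return one entry, raising a readable IndexError when out of range."""
    if not 0 <= index < len(catalogue):
        raise IndexError(
            f"Index {index} out of range for catalogue of size {len(catalogue)}"
        )
    return catalogue[index]


# Positional arguments are stored as a tuple and printed as one.
positional = IndexError("Index out of range:", 7)
formatted = IndexError(f"Index out of range: {7}")

print(str(positional))
print(str(formatted))
```

The first message prints as a tuple representation, the second as a plain sentence, which is usually what ends up in a log or traceback.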

Tests

None of the new code has tests. The mgs/msg typo in summary.py would have been caught immediately.
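
As a rough idea of how little is needed, here is a sketch of a smoke test in pytest style. The helper build_message is a hypothetical stand-in, not a function from summary.py; the second variant reproduces the kind of typo described above.

```python
def build_message(n_done, n_total):
    """Hypothetical summary helper, a stand-in for code in summary.py."""
    msg = f"{n_done}/{n_total} jobs done"
    return msg


def build_message_with_typo(n_done, n_total):
    """Same helper with the kind of typo the review describes."""
    msg = f"{n_done}/{n_total} jobs done"
    return mgs  # noqa: F821 (undefined name, kept on purpose)


def test_build_message():
    assert build_message(3, 10) == "3/10 jobs done"


def test_typo_is_caught():
    # Merely calling the function is enough to surface the NameError.
    try:
        build_message_with_typo(3, 10)
    except NameError:
        return
    raise AssertionError("expected a NameError")


test_build_message()
test_typo_is_caught()
```

Even a test that only calls each code path once would fail on an undefined name, without asserting anything about the output.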

@martinkilbinger

Please check again @cailmdaley, I have responded to all the points in your review. Thanks!

martinkilbinger and others added 2 commits February 6, 2026 07:48
- canfar_monitor: "Montior" typo, try/except/else flow for bulk destroy
- canfar_submit: "debut_out" dict key typo
- uncompress_fits: header assignment inside if-name guard
- distribute_tiles: check arg value not flag name for dry_run

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@cailmdaley

Hey Martin, thanks for the fixes! I took the liberty of pushing a few more small ones I spotted while verifying:

  • canfar_monitor.py — "Montior" → "Monitor" (the old typo became a new one), and the bulk destroy try/except still printed "Success" unconditionally (added an else: clause)
  • canfar_submit.py — dict key debut_out → debug_out
  • uncompress_fits.py — header[name] = (value, comment) was still outside the if name: guard
  • distribute_tiles.py — dry_run check was comparing the flag name (-n) against ("0", "1") instead of the flag's value
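
For reference, a sketch of the dry_run check using argparse. This assumes a hypothetical -n/--dry_run option taking 0 or 1; how distribute_tiles.py actually parses its arguments is not shown in this thread.

```python
import argparse


def parse_args(argv):
    """Parse a hypothetical -n/--dry_run option whose value is 0 or 1."""
    parser = argparse.ArgumentParser()
    # type=int converts the command-line string before any comparison.
    parser.add_argument("-n", "--dry_run", type=int, choices=(0, 1), default=0)
    return parser.parse_args(argv)


args = parse_args(["-n", "1"])
# Compare the parsed value, not the flag name "-n".
print(bool(args.dry_run))

# Without conversion the comparison is between a string and an int.
print("1" == 1)
```

Converting once at parse time avoids both earlier problems: comparing strings with int literals, and comparing the flag name instead of its value.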

Let me know if I made any mistakes. Have a nice weekend!

@martinkilbinger martinkilbinger marked this pull request as ready for review February 7, 2026 09:10

@sfarrens sfarrens left a comment

Hey @martinkilbinger sorry for the delay. I am a bit rusty on ShapePipe 😅, so I may have missed something, but overall the proposed changes look fine. I opened a few threads to resolve some minor issues.

from shapepipe.utilities.summary import *

from summary_params_pre_v2 import *
from shapepipe.utilities.summary_params_pre_v2 import *

Not really important for this PR, but in general it is not good practice to use import * as it makes it harder to trace the origin of objects. It would be better to use from shapepipe.utilities import summary_params_pre_v2 if you need the whole module, or to simply import the objects you actually need.
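
A small sketch of the problem, using the standard library math module rather than the ShapePipe modules so that it runs anywhere. The exec call is only a device to run a star import inside a throwaway namespace.

```python
import math
from math import sqrt

# A namespace that already defines its own "pi".
namespace = {"pi": 3}

# exec is used only to run a star import inside a throwaway namespace.
exec("from math import *", namespace)

# The star import silently replaced the earlier definition.
print(namespace["pi"] == math.pi)

# An explicit import leaves no doubt about where a name comes from.
print(sqrt(16.0))
```

With two star imports in a row, as in the snippet under review, a later module can shadow names from the earlier one in the same silent way.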

Main program
"""
# Scripts to call canfar classes are created by pyproject.toml
return 0

What is the point of this function?

This is fine, but strictly speaking there should be some docstrings to explain the point of the module and the corresponding functions.

self._patch = os.environ["patch"] if "patch" in os.environ else "P0"

# Set job parameters
version = "1.1"

Why is this hardcoded here?
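
One possible alternative, sketched under assumptions: read both values from the environment and keep the defaults in one place. The environment variable name "version" is a hypothetical choice; "patch" and the two default values come from the snippet above.

```python
import os

DEFAULT_PATCH = "P0"
DEFAULT_VERSION = "1.1"


def get_job_settings(environ=None):
    """Read job settings from the environment, with module-level defaults.

    The variable name "version" is a hypothetical choice for this sketch;
    "patch" and the two default values are taken from the snippet above.
    """
    environ = os.environ if environ is None else environ
    return {
        # dict.get replaces the `x if key in d else default` pattern.
        "patch": environ.get("patch", DEFAULT_PATCH),
        "version": environ.get("version", DEFAULT_VERSION),
    }


print(get_job_settings({}))
print(get_job_settings({"patch": "P9", "version": "1.2"}))
```

Passing the mapping as an argument keeps the function easy to test without touching the real environment.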

print(f"Each session will process ~{math.ceil(total_n / num_replicas)} jobs using chunk()")
print("Sessions = ", sessions)
except Exception as e:
print(f"❌ CANFAR session.create() failed: {type(e).__name__}: {e}")

I would be surprised if printing emojis worked universally, but fine if it works where you need to run it.
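
If portability ever becomes a concern, a fallback could look like the sketch below. The helper pick_symbol and the "[FAILED]" replacement text are hypothetical and not part of the PR.

```python
import sys


def pick_symbol(symbol, fallback, encoding):
    """Return `symbol` if `encoding` can represent it, else `fallback`."""
    try:
        symbol.encode(encoding)
    except (UnicodeEncodeError, LookupError):
        # LookupError covers an unknown encoding name.
        return fallback
    return symbol


# sys.stdout.encoding can be None when stdout is replaced; assume ASCII then.
stdout_encoding = getattr(sys.stdout, "encoding", None) or "ascii"
mark = pick_symbol("\u274c", "[FAILED]", stdout_encoding)
print(f"{mark} CANFAR session.create() failed")
```

On a UTF-8 terminal this prints the emoji as before; on an ASCII-only stream it degrades to plain text instead of raising UnicodeEncodeError.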

I didn't spot any major issues, but more docstrings would be good.

Comment on lines 83 to 84
def check_params(self):
pass

What is the point of this?
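
If the method is meant to stay, one option is to give it a body and a docstring. The sketch below is purely illustrative: the class JobParams, the n_smp parameter, and the defaults are assumptions, not the PR's actual class or parameters.

```python
class JobParams:
    """Hypothetical container for job parameters (not the PR's class)."""

    # Illustrative defaults; the real params_default may differ.
    params_default = {"n_smp": 1, "verbose": False}

    def __init__(self, **kwargs):
        self._params = {**self.params_default, **kwargs}
        self.check_params()

    def check_params(self):
        """Check Params.

        Validate the job parameters after defaults have been applied.

        Raises
        ------
        KeyError
            If a parameter has no entry in ``params_default``
        ValueError
            If ``n_smp`` is not a positive integer

        """
        unknown = set(self._params) - set(self.params_default)
        if unknown:
            raise KeyError(f"Unknown parameter(s): {sorted(unknown)}")
        n_smp = self._params["n_smp"]
        if not isinstance(n_smp, int) or n_smp < 1:
            raise ValueError(f"n_smp must be a positive integer, got {n_smp!r}")


print(JobParams(n_smp=4)._params)
```

A check like this would also have flagged a parameter that is used but missing from the defaults, since every key is compared against params_default.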

Same point re: docstrings

Development

Successfully merging this pull request may close these issues.

[BUG] canfar minor issues

3 participants